Evolutionary Data Preprocessing to Alleviate Class Imbalance

Abstract

Intrusion detection technology for network attacks is developing rapidly with the development of artificial intelligence technology. Recently, machine learning-based methods that can detect new types of attacks have been developed. To improve the classification performance of rare classes in an intrusion detection dataset, we study an efficient data preprocessing method based on machine learning. UNSW-NB15, a well-known intrusion detection dataset, is used in the experiments. The dataset includes 9 attack types and has severe class imbalance and class overlap, so it is difficult to improve classification performance above a certain level. Performance can be improved by adjusting the number of instances as needed. SMOTE techniques and genetic algorithms are used to optimize the ratio between classes in the training dataset. The computation time is reduced by creating samples of only a few percent of the UNSW-NB15 dataset. Many datasets are generated from the small samples according to randomly generated ratios, and experiments are conducted on these datasets. Combining the results of the experiments, a regression model is built, and the best tuple of ratios is searched for by applying the regression model as the fitness function of the genetic algorithm. The D-S-1G combination exhibited the best performance among the test results. It consists of a decision tree classifier and a support vector regressor (SVR). As a result, the computation time was significantly reduced, and the optimal ratios showed better performance than the experimental results on the original dataset. We also found that each result relies heavily on the type of classifier.
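The search loop described in the abstract can be illustrated with a small, self-contained sketch. It is not the authors' implementation, and several pieces are assumptions made for illustration: `measured_score` is a synthetic stand-in for "resample a small sample with these per-class ratios (for example with SMOTE), train a classifier, and return its score"; a k-nearest-neighbor average replaces the support vector regressor as the surrogate model; and the three-class setup, population size, and mutation settings are arbitrary.

```python
import random

N_CLASSES = 3  # toy setting; hypothetical, not the class count of UNSW-NB15


def measured_score(ratios):
    """Synthetic stand-in for an expensive experiment: resample with these
    per-class ratios, train a classifier, and return its score."""
    target = (0.2, 0.5, 0.8)  # hypothetical optimum, for illustration only
    return 1.0 - sum((r - t) ** 2 for r, t in zip(ratios, target)) / len(target)


def knn_surrogate(history, k=3):
    """Build a cheap regression model from (ratios, score) pairs.
    A k-NN average is used here in place of an SVR to stay stdlib-only."""
    def predict(ratios):
        nearest = sorted(
            history,
            key=lambda h: sum((a - b) ** 2 for a, b in zip(h[0], ratios)),
        )[:k]
        return sum(score for _, score in nearest) / len(nearest)
    return predict


def genetic_search(fitness, n_genes, rng, pop_size=30, generations=40,
                   mutation_rate=0.2):
    """Simple genetic algorithm over ratio tuples with genes in [0, 1]."""
    pop = [tuple(rng.random() for _ in range(n_genes)) for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        elite = pop[: pop_size // 2]  # keep the better half unchanged
        children = []
        while len(elite) + len(children) < pop_size:
            a, b = rng.sample(elite, 2)
            # Uniform crossover: each gene comes from one of the two parents.
            child = [rng.choice(pair) for pair in zip(a, b)]
            # Gaussian mutation, clamped so ratios stay in [0, 1].
            child = [
                min(1.0, max(0.0, g + rng.gauss(0, 0.1)))
                if rng.random() < mutation_rate else g
                for g in child
            ]
            children.append(tuple(child))
        pop = elite + children
    return max(pop, key=fitness)


rng = random.Random(0)

# Step 1: evaluate many randomly generated ratio tuples on the cheap proxy.
history = []
for _ in range(200):
    ratios = tuple(rng.random() for _ in range(N_CLASSES))
    history.append((ratios, measured_score(ratios)))

# Step 2: fit the surrogate regression model on those results.
surrogate = knn_surrogate(history)

# Step 3: search for the best ratio tuple using the surrogate as fitness.
best = genetic_search(surrogate, N_CLASSES, rng)
print("best ratios:", best)
print("score of best ratios:", measured_score(best))
```

The point of the design is that the genetic algorithm never runs the expensive experiment itself. It only queries the surrogate, so the cost of the search is dominated by the initial batch of experiments on small samples.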

Similar resources

On dynamic ensemble selection and data preprocessing for multi-class imbalance learning

Class imbalance refers to classification problems in which many more instances are available for certain classes than for others. Such imbalanced datasets require special attention because traditional classifiers generally favor the majority class, which has a large number of instances. Ensembles of classifiers have been reported to yield promising results. However, the majority of ensemble metho...

Class Imbalance Problem in Data Mining Review

In the last few years, there have been major changes and evolution in the classification of data. As the application area of technology increases, the size of data also increases. Classification of data becomes difficult because of the unbounded size and imbalanced nature of the data. The class imbalance problem has become the greatest issue in data mining. The imbalance problem occurs where one of the two classes havi...

A Proposal of Evolutionary Prototype Selection for Class Imbalance Problems

Unbalanced data in a classification problem appears when there are many more instances of some classes than others. Several solutions were proposed to solve this problem at data level by undersampling. The aim of this work is to propose evolutionary prototype selection algorithms that tackle the problem of unbalanced data by using a new fitness function. The results obtained show that a balanci...

On the effectiveness of preprocessing methods when dealing with different levels of class imbalance

The present paper investigates the influence of both the imbalance ratio and the classifier on the performance of several resampling strategies to deal with imbalanced data sets. The study focuses on evaluat...

Extracting Predictor Variables to Construct Breast Cancer Survivability Model with Class Imbalance Problem

Application of data mining methods as a decision support system has a great benefit to predict survival of new patients. It also has a great potential for health researchers to investigate the relationship between risk factors and cancer survival. But due to the imbalanced nature of datasets associated with breast cancer survival, the accuracy of survival prognosis models is a challenging issue...

Journal

Journal title: Security and Communication Networks

Year: 2022

ISSN: 1939-0122, 1939-0114

DOI: https://doi.org/10.1155/2022/3761205